Picture for Boyi Li

Boyi Li

Accelerating Structured Chain-of-Thought in Autonomous Vehicles

Add code
Feb 02, 2026
Viaarxiv icon

Toward Cognitive Supersensing in Multimodal Large Language Model

Add code
Feb 02, 2026
Viaarxiv icon

Counterfactual VLA: Self-Reflective Vision-Language-Action Model with Adaptive Reasoning

Add code
Dec 30, 2025
Viaarxiv icon

Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving

Add code
Dec 12, 2025
Viaarxiv icon

FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos

Add code
Dec 11, 2025
Figure 1 for FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
Figure 2 for FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
Figure 3 for FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
Figure 4 for FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos
Viaarxiv icon

AccidentBench: Benchmarking Multimodal Understanding and Reasoning in Vehicle Accidents and Beyond

Add code
Sep 30, 2025
Viaarxiv icon

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Add code
Sep 09, 2025
Figure 1 for Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Figure 2 for Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Figure 3 for Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Figure 4 for Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Viaarxiv icon

MultiGen: Using Multimodal Generation in Simulation to Learn Multimodal Policies in Real

Add code
Jul 03, 2025
Viaarxiv icon

Describe Anything: Detailed Localized Image and Video Captioning

Add code
Apr 22, 2025
Viaarxiv icon

Scaling Vision Pre-Training to 4K Resolution

Add code
Mar 25, 2025
Viaarxiv icon